SCC77 33801 bp DNA BCT 10-MAY-2000

Streptomyces coelicolor cosmid C77.

AL136503 

AL136503.1 GI:6714747 

adenosine deaminase; carbohydrate kinase; dehydratase; 
dihydrodipicolinate synthase; DNA-binding protein; DnaJ protein; 
dnaJ2; Era-like GTP-binding protein; GTP-binding protein; 
heat-inducible transcriptional repressor; Hit-family protein; hrcA; 
hydrolase; IclR-family transcriptional regulator; lepA; 
lipoprotein; long-chain fatty-acid CoA ligase; oxidoreductase; 
oxygen-independent coproporphyrinogen III oxidase; protease; 
transmembrane efflux protein; transmembrane transport protein. 
Streptomyces coelicolor A3(2).
Streptomyces coelicolor A3(2)

Bacteria; Firmicutes; Actinobacteria; Actinobacteridae;
Actinomycetales; Streptomycineae; Streptomycetaceae; Streptomyces.

1 (bases 1 to 33801) 

Redenbach,M., Kieser,H.M., Denapaite,D., Eichner,A., Cullum,J.,
Kinashi,H. and Hopwood,D.A.

A set of ordered cosmids and a detailed genetic and physical map 
for the 8 Mb Streptomyces coelicolor A3 (2) chromosome 
Mol. Microbiol. 21 (1), 77-96 (1996) 
97000351 

2 (bases 1 to 33801) 
Oliver,K. and Harris,D.
Unpublished

3 (bases 1 to 33801)

Thomson,N.R., Parkhill,J., Barrell,B.G. and Rajandream,M.A.
Direct Submission

Submitted (17-JAN-2000) Streptomyces coelicolor sequencing project,
Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge
CB10 1SA E-mail: barrell@sanger.ac.uk Cosmids supplied by Prof.
David A. Hopwood, John Innes Centre, Norwich Research Park,
Colney, Norwich, Norfolk NR4 7UH, UK



alignment_scores:

Quality:            397.00    Length:           363
Ratio:              1.829     Gaps:             10
Percent Similarity: 59.780    Percent Identity: 33.609
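
As a quick arithmetic check on the scores above, the two percentages can be turned back into position counts. This is only a sketch: it assumes both percentages are computed as matching positions divided by the reported length of 363, which the output itself does not state, and the helper names are illustrative.

```python
# Assumption: percent = 100 * matching_positions / alignment_length.
# The report prints only the percentages and the length; the counts are inferred.
LENGTH = 363


def implied_count(percent_value: float, length: int = LENGTH) -> int:
    """Return the whole number of positions that best explains a reported percentage."""
    return round(percent_value * length / 100)


def percent(count: int, length: int = LENGTH) -> float:
    """Recompute a percentage to three decimals, the precision used in the report."""
    return round(100 * count / length, 3)


identical = implied_count(33.609)   # positions counted as identities
similar = implied_count(59.780)     # positions counted as similarities

print(identical, percent(identical))
print(similar, percent(similar))
```

Under that assumption, both reported percentages round-trip exactly from whole-number counts, which suggests the figures were transcribed correctly.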



alignment_block : 
US-09-477-962-115 x SCC77/rev 

Align seg 1/1 to reverse of: SCC77 from: 1 to: 33801 

11 IleProAlaIleArgGluAlaLeuGlyAspGluLysAspProArgLeuAl 27
:::|||||| :::||||||::: Ml III:: 

26678 CTGCCCGCG TCCGCGCTCGCCGGGGCCGCCGACCGCCCCCTCGG 26635 



27 aLeuTyrValHisValProPheCysSerSerLysCysHisPheCysAspT 44

afr^W^lW 1 sss 111 ^ : :::::! I I v|J I Ml.:.' ~); ' ', k ^ ...^^f^. 



58 ArgGluArgSerProTyrValThrAlaLeuCysAspGlnIleArgPheTy 74
:::'U|::": III::: :::||| III :::::: II I 
26537 GCCTCCCGCGACAACTACGCCGACACCCTCGTCGACGAGGTCCGC 26493 

74 rGlyProGlnLeuThrArg LeuGlyTyrArgPro G 86 

Ml:::lll llllll Ml : 

26492 ......... . CTGGCCCGCAAGGTGCTCGGCGACGACCCCCGCGAGGTCC 26453 

86 luValMetTyrTrpGlyGlyGlyThrProThrArgLeuThrGlyAspGlu 102 

::::::::::: I I I I I I i I I I I I II I I I I I I | : : : : :: : : : 
26452 GCACGGTCTTCGTCGGCGGCGGTACGCCCACCCTGCTGGCCGCCGGCGAC 26403 

103 MetThrAlaValHisGlnAlaLeuAspAspAlaPheAspLeuThrGlyLe 119 

:::::: : : : III::: III III III::: 

26402 CTGGTGCGGATGCTGGGCGCGATCCGCGACGAGTTCGGCCTGGCACCGGA 26353 

119 uArgGlnTrpSerValGluSerThrProAsnAspLeuAspProAlaThrL 136
::: :::::: I I I :::::: I I I ::::::::: I I I I I I I I I I 
26352 CGCGGAGATCACCACGGAGGCCAACCCGGAGTCCGTCGACCCGGCGTATC 26303 

136 euAspThrLeuArgGlyLeuGlyValThrArgValSerValGlyValGln 152 

II I I I I I I I I I : : : III : : : I I I : : : I I I I I I : : : I I I 
26302 TCGCCACCCTCCGCGCGGGCGGCTTCAACCGGATCTCCTTCGGCATGCAG 26253 

153 SerLeuAsnProTyrGlnLeuArgLysAlaGlyArgAlaHisSerArgGl 169 

III : : : : : : III::: | | | : : : | | | : : : 

26252 AGCGCCAAGCAGCACGTCCTGAAGATCCTGGACCGCACCCACACCCCGGG 26203

169 uGlnAlaLeuAlaAlaValProLeuLeuArgArgAlaGlyIleAspGluP 186

: :: I I I : : : I II III I I I M I : : : I I I : : : 

26202 ACGCCCCGAGGCCTGTGTCGCCGAGGCCCGCGCGGCCGGCTTCGACCACG 26153 

186 heAsnValAspLeuIleAlaGlyPheProGlyGluAlaValGluSerPhe 202
I I I : : : llTl I I I I I III I I I I I I I I I : : : ::::::::: 
26152 TCAACCTCGACCTGATCTACGGCACCCCCGGCGAGTCCGACGACGACTGG 26103 

203 GluGluThrLeuArgThrValLeuAlaLeuAspProProHisValSerVa 219 
: : : : : : I I I :::::: I I I : : : III I I I I II I I I : : 

26102 CGGGCCTCCCTGGACGCCGCGCTCGGCGCCGGACCCGACCACGTCTCGGC 26053 

219 lTyrProTyrArgAlaThrProLysThrValMetAlaMetGlnLeuAspA 236 
: I I I : : : III : : : II I :::::: I 

26052 GTACGCCCTGATCGTCGAGGAGGGCACCCAGCTCGCCCGCCGCATCCGGC 26003 

236 rgGluPheValGluAlaArgAsnArgAspGlyMetIleAspAlaTyrGlu 252

II III :: : II I III II I 

26002 GCGGCGAGGTCCCGATGACCGACGACGACGTGCACGCCGACCGGTACCTG 25953 

253 ArgAlaMetAlaAlaLeuGlyAlaAlaGlyTyrHisGluTyrCysHisGl 269 

, , M I I I I I I I I I ::: I I I I I I I I I I I I III : : 

2 5952 ATC<^ 25 903 

269 yTyrTrpValArgAsp............AlaArgHisGluAspGlnAspG 282

. ^ : »^ • l^lfi;: : : : : : : : : | | I 
25902 CAACTGGGCCACCTCCGACGCGGGGCGCTGCCTGCAG . . . . . 25866



282 lyAsnTyrLysTyrAspLeuAlaGlyAspLysIleGlyPheGlySerGly 298



25865 . .AACGAGCTGTACTGGCGGGGCGCCGACTGGTGGGGCGCGGGACCGGGC 25818 

299 AlaGluSerIleIleGlyHisHisLeuLeuTrpAsn......GluAsnSe 313

|||:::||| :::||| I I I I I I 

25817 GCGCACTCCCACGTGGGGGGCGTGCGGTGGTGGAACGTGAAGCACCCGGG 25768 



313 rAlaTyrAlaArgTyrLeuLeuAlaProArgGluPheSerAlaAlaHisA 330 

= 111111111 III III :::::: : : : I I I : : : : : : : 

25767 GGCGTACGCGGGGGCGCTGGCGGCGGGCAAGTCGCCGGGCGCGGGGCGCG 25718 

330 rgPheThrThrAlaGluProAspArgLeuThrAlaProValGlyGlyAla 346 

: : : : : III Ml I II I I I : : : ' : : : 
25717 AGATCCTCACGGACGAG . . . GACCGGCGCGTGGAGCGCATCCTGCTGGAG 25671 



347 LeuMetThrArgGluGlyValValPheAlaArgPheArg 359 

Ml Ill :::::: :::IM 

25670 CTGCGCCTGCGGGAGGGCGTCCCGCTGTCGCTGCTGCGG 25632 
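
To read the subject rows above, it may help to see how one nucleotide row maps onto residues of the encoded protein. The following is a minimal sketch, not the alignment program's own method. It assumes the standard codon assignments (which bacterial translation table 11 shares for elongation) and that the row starting at position 26302 sits two bases before a codon boundary, an offset inferred here by comparison with the CAB66237 protein record. The names `translate` and `reverse_complement` are illustrative.

```python
# Standard codon table built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, offset: int = 0) -> str:
    """Translate complete codons of seq, starting at the given 0-based offset."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Subject row 26302..26253 as displayed (already on the reverse strand).
row = "TCGCCACCCTCCGCGCGGGCGGCTTCAACCGGATCTCCTTCGGCATGCAG"
peptide = translate(row, offset=2)
print(peptide.lower())

# The deposited forward-strand bases for the same interval (assumed orientation).
forward = reverse_complement(row)
```

The translated peptide corresponds to residues 168 to 183 (atlraggfnrisfgmq) of the CAB66237 sequence listed further down. Because the alignment is reported against the reverse of SCC77, the displayed rows translate directly, and `reverse_complement` is needed only to recover the strand as deposited.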
1: AL136503. Streptomyces coel...[gi:6714747]
SCC77 33801 bp DNA linear BCT 10-MAY-2000

Streptomyces coelicolor cosmid C77. 

AL136503 

AL136503.1 GI:6714747 

adenosine deaminase; carbohydrate kinase; dehydratase; 
dihydrodipicolinate synthase; DNA-binding protein; DnaJ protein; 
dnaJ2; Era-like GTP-binding protein; GTP-binding protein; 
heat-inducible transcriptional repressor; Hit-family protein; hrcA; 
hydrolase; IclR-family transcriptional regulator; lepA; 
lipoprotein; long-chain fatty-acid CoA ligase; oxidoreductase; 
oxygen-independent coproporphyrinogen III oxidase; protease; 
transmembrane efflux protein; transmembrane transport protein. 
Streptomyces coelicolor A3(2). 
Streptomyces coelicolor A3(2)

Bacteria; Firmicutes; Actinobacteria; Actinobacteridae;
Actinomycetales; Streptomycineae; Streptomycetaceae; Streptomyces.

1 (bases 1 to 33801) 

Redenbach,M., Kieser,H.M., Denapaite,D., Eichner,A., Cullum,J.,
Kinashi,H. and Hopwood,D.A. 

A set of ordered cosmids and a detailed genetic and physical map 
for the 8 Mb Streptomyces coelicolor A3 (2) chromosome 
Mol. Microbiol. 21 (1), 77-96 (1996) 
97000351 

2 (bases 1 to 33801) 
Oliver,K. and Harris,D.
Unpublished

3 (bases 1 to 33801)

Thomson,N.R., Parkhill,J., Barrell,B.G. and Rajandream,M.A.
Direct Submission 

Submitted (17-JAN-2000) Streptomyces coelicolor sequencing project, 
Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge 
CB10 1SA E-mail: barrell@sanger.ac.uk Cosmids supplied by Prof.
David A. Hopwood, John Innes Centre, Norwich Research Park,
Colney, Norwich, Norfolk NR4 7UH, UK 
Notes: 

Streptomyces coelicolor sequencing at The Sanger Centre is funded 
by the BBSRC and Beowulf Genomics 

Details of S. coelicolor sequencing at the Sanger Centre are 
available on the World Wide Web. 

(URL; http://www.sanger.ac.uk/Projects/S_coelicolor/)

CDS are numbered using the following system eg SC7B7.01c. SC (S.
coelicolor), 7B7 (cosmid name), .01 (first CDS), c (complementary
strand).

The more significant matches with motifs in the PROSITE database
are also included but some of these may be fortuitous.

The length in codons is given for each CDS.

Usually the highest scoring match found by fasta -o is given for
CDS which show significant similarity to other CDS in the database.
The position of possible ribosome binding site sequences are given 
where these have been used to deduce the initiation codon. 
Gene prediction is based on positional base preference in codons 
11: CAB66237. putative oxygen-...[gi:6714773]
CAB66237 435 aa linear BCT 10-MAY-2000

putative oxygen-independent coproporphyrinogen III oxidase. 
[Streptomyces coelicolor A3 (2)]. 
CAB66237 
g6714773 

CAB66237.1 GI:6714773

embl locus SCC77, accession AL136503.1

Streptomyces coelicolor A3(2).
Streptomyces coelicolor A3(2)

Bacteria; Firmicutes; Actinobacteria; Actinobacteridae;
Actinomycetales; Streptomycineae; Streptomycetaceae; Streptomyces.

1 (residues 1 to 435) 

Redenbach,M., Kieser,H.M., Denapaite,D., Eichner,A., Cullum,J.,
Kinashi,H. and Hopwood,D.A. 

A set of ordered cosmids and a detailed genetic and physical map 
for the 8 Mb Streptomyces coelicolor A3 (2) chromosome 
Mol. Microbiol. 21 (1), 77-96 (1996) 
97000351 

2 (residues 1 to 435) 
Oliver,K. and Harris,D.
Unpublished

3 (residues 1 to 435)

Thomson,N.R., Parkhill,J., Barrell,B.G. and Rajandream,M.A.
Direct Submission 

Submitted (17-JAN-2000) Streptomyces coelicolor sequencing project,
Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge
CB10 1SA E-mail: barrell@sanger.ac.uk Cosmids supplied by Prof.
David A. Hopwood, John Innes Centre, Norwich Research Park,
Colney, Norwich, Norfolk NR4 7UH, UK 
Notes:

Streptomyces coelicolor sequencing at The Sanger Centre is funded 
by the BBSRC and Beowulf Genomics 

Details of S. coelicolor sequencing at the Sanger Centre are 
available on the World Wide Web. 

(URL; http://www.sanger.ac.uk/Projects/S_coelicolor/)
CDS are numbered using the following system eg SC7B7.01c. SC (S.
coelicolor), 7B7 (cosmid name), .01 (first CDS), c (complementary
strand).

The more significant matches with motifs in the PROSITE database 
are also included but some of these may be fortuitous. 
The length in codons is given for each CDS.
Usually the highest scoring match found by fasta -o is given for
CDS which show significant similarity to other CDS in the database.
The position of possible ribosome binding site sequences are given
where these have been used to deduce the initiation codon.
Gene prediction is based on positional base preference in codons
using a specially developed Hidden Markov Model (Krogh et al.,
Nucleic Acids Research, 22(22):4768-4778(1994)) and the FramePlot
program of Bibb et al., Gene 30:157-66(1984) as implemented at 
http://www.nih.go.jp/jun/cgi-bin/frameplot.pl. CAUTION: We may not have predicted the
correct initiation codon. Where possible we choose an initiation
codon (atg, gtg, ttg or (att)) which is preceded by an upstream
ribosome binding site sequence (optimally 5-13bp before the
initiation codon). If this cannot be identified we choose the most
upstream initiation codon.

IMPORTANT: This sequence MAY NOT be the entire insert of the
sequenced clone. It may be shorter because we only sequence
overlapping sections once, or longer, because we arrange for a
small overlap between neighbouring submissions.

Cosmid C77 lies between and overlaps with cosmids C117 and C123 on
the AseI-C genomic restriction fragment.

FEATURES             Location/Qualifiers
     source          1..435
                     /organism="Streptomyces coelicolor A3(2)"
                     /strain="A3(2)"
                     /db_xref="taxon:100226"
                     /clone="cosmid C77"
     Protein         1..435
                     /product="putative oxygen-independent coproporphyrinogen
                     III oxidase."
     CDS             1..435
                     /gene="SCC77.26c"
                     /coded_by="complement(AL136503.1:25494..26801)"
                     /transl_table=11
                     /note="SCC77.26c, possible oxygen-independent
                     coproporphyrinogen III oxidase (EC 1.-.-.-), len: 435 aa.
                     Highly similar to many putative coproporphyrinogen III
                     oxidases including: Bacillus subtilis
                     SW:HEMN_BACSU(EMBL:X91655) probable oxygen-independent
                     coproporphyrinogen III oxidase (366 aa), fasta scores opt:
                     490 z-score: 555.3 E(): 1.5e-23 30.8% identity in 328 aa
                     overlap and Mycobacterium tuberculosis
                     SW:HEMN_MYCTU(EMBL:Z81368) probable oxygen-independent
                     coproporphyrinogen III oxidase (375 aa), fasta scores opt:
                     1358 z-score: 1530.5 E():0 56.5% identity in 382 aa
                     overlap. Contains a Prosite hit to PS00017 ATP/GTP-binding
                     site motif A (P-loop)."
ORIGIN
        1 mngrreraqg tewagdpagc gtmermpsal pdgepvpadg alpasalaga adrplgfylh
       61 vpycatrcgy cdfntytate lrgtggvlas rdnyadtlvd evrlarkvlg ddprevrtvf
      121 vgggtptlla agdlvrmlga irdefglapd aeitteanpe svdpaylatl raggfnrisf
      181 gmqsakqhvl kildrthtpg rpeacvaear aagfdhvnld liygtpgesd ddwrasldaa
      241 lgagpdhvsa yaliveegtq larrirrgev pmtdddvhad ryliaeeals aagydwyevs
      301 nwatsdagrc lhnelywrga dwwgagpgah shvggvrwwn vkhpgayaga laagkspgag
      361 reiltdedrr verillelrl regvplsllr eaglaasrra lsegllqegp yeagsavltl
      421 rgrlladaw rdlvd
//
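
The CDS naming scheme described in the notes and the coded_by location in the record can be cross-checked mechanically. The sketch below is illustrative only: the regular expressions, function names and returned field names are assumptions made here, not part of any NCBI or Sanger tool, and the location pattern handles only a simple single-interval location such as the one in this record.

```python
import re

# SC + cosmid name + "." + CDS number + optional "c" for complementary strand.
GENE_RE = re.compile(r"^SC(?P<cosmid>[0-9A-Za-z]+)\.(?P<index>\d+)(?P<strand>c?)$")

# Single-interval location, optionally wrapped in complement(...).
LOC_RE = re.compile(
    r"^(?P<comp>complement\()?(?P<acc>[A-Z0-9_.]+):(?P<start>\d+)\.\.(?P<end>\d+)\)?$"
)


def parse_cds_name(name: str) -> dict:
    """Split a name such as SC7B7.01c into cosmid, CDS number and strand flag."""
    m = GENE_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised CDS name: {name!r}")
    return {
        "cosmid": m["cosmid"],
        "index": int(m["index"]),
        "complementary": m["strand"] == "c",
    }


def parse_coded_by(location: str) -> dict:
    """Parse a simple coded_by value into accession, coordinates and strand flag."""
    m = LOC_RE.match(location)
    if m is None:
        raise ValueError(f"unsupported location: {location!r}")
    return {
        "accession": m["acc"],
        "start": int(m["start"]),
        "end": int(m["end"]),
        "complementary": m["comp"] is not None,
    }


def codon_count(start: int, end: int) -> int:
    """Number of codons spanned by an inclusive nucleotide interval."""
    length = end - start + 1
    if length % 3:
        raise ValueError("interval length is not a multiple of three")
    return length // 3


gene = parse_cds_name("SCC77.26c")
loc = parse_coded_by("complement(AL136503.1:25494..26801)")
codons = codon_count(loc["start"], loc["end"])
print(gene, loc, codons)
```

The interval 25494..26801 spans 1308 bases, or 436 codons. That is consistent with the 435 residues reported plus one stop codon, assuming the location includes the stop codon as GenBank CDS features normally do. Both the trailing "c" in the gene name and the complement() wrapper indicate the complementary strand.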
alignment_block:
US-09-477-962-115 x US-09-477-962-1

Align seg 1/1 to: US-09-477-962-1 from: 1 to: 58857

1 MetSerHisAlaIleGlyProSerArgLeuIleProAlaIleArgGluAl 17
II Ml II I MM I I Ml I II li Ml Mil I I M I II I I I I I I Ml li II I 
57583 ATGAGCCACGCCATCGGACCGAGCCGGCTGATCCCCGCCATCCGCGAAGC 57632

17 aLeuGlyAspGluLysAspProArgLeuAlaLeuTyrValHisValProP 34 
M I M I M I I I I I I M M M M M M M I M M M I I M M M M M M I 
57633 GCTCGGGGACGAGAAGGACCCCCGGCTCGCCCTCTACGTCCACGTCCCCT 57682 

34 heCysSerSerLysCysHisPheCysAspTrpValThrAspIleProVal 50 
I I I I M I I I I M II I II M I I I I M I II I I I I I I I I I I I I I I M I M I I I 
57683 TCTGCTCCTCCAAGTGCCACTTCTGCGACTGGGTCACCGACATCCCCGTC 57732 

51 AlaArgLeuArgGlyAspSerArgGluArgSerProTyrValThrAlaLe 67 
I I I M i I I M 1 I I I I I II I I I I tl I I I I! 1 I f I I I I I M I I I I II I I I I 
57733 GCACGCCTGCGCGGCGACAGCCGGGAACGCTCGCCCTACGTCACCGCCCT 57782 

67 uCysAspGlnIleArgPheTyrGlyProGlnLeuThrArgLeuGlyTyrA 84
I I II I I I I I I II M M M I I I I M I I I I I I I I M I I I M M I I M I M M 

57783 CTGCGACCAGATCCGCTTCTACGGCCCCCAGCTCACCCGGCTCGGCTACC 57832 

84 rgProGluValMetTyrTrpGlyGlyGlyThrProThrArgLeuThrGly 100 

I I I I II II I I I I I I II I M I I M I II II I I I I I I I I I I I I I I I I M I II I I 
57833 GCCCCGAGGTCATGTACTGGGGCGGCGGCACCCCCACCCGGCTCACCGGC 57882 

101 AspGluMetThrAlaValHisGlnAlaLeuAspAspAlaPheAspLeuTh 117 

I I II I I I I II I I I I I I I I I M I II I I I I I I I I I I I II I I I I I I I I I I I I I 
57883 GACGAGATGACGGCCGTCCACCAGGCCCTCGACGACGCCTTCGACCTGAC 57932 

117 rGlyLeuArgGlnTrpSerValGluSerThrProAsnAspLeuAspProA 134 

I I I I I I I I II I I I I I I I II I I M I I I I II II I I I I II I I I I I I I I I I I I I 
57933 GGGACTCCGCCAGTGGTCGGTGGAGAGCACCCCGAACGACCTCGACCCCG 57982 

134 laThrLeuAspThrLeuArgGlyLeuGlyValThrArgValSerValGly 150 
M I I I II I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
57983 CCACCCTCGACACCCTGCGCGGCCTCGGCGTCACCCGCGTCAGCGTCGGC 58032 



151 ValGlnSerLeuAsnProTyrGlnLeuArgLysAlaGlyArgAlaHisSe 167

I I I I I I II I I I I II I I I I I I I I I I I I I I M II I I I II I II I I I I I I I I I I 
58033 GTCCAGTCGCTCAACCCGTACCAGCTGCGCAAGGCAGGCCGGGCCCACTC 58082 

167 rArgGluGlnAlaLeuAlaAlaValProLeuLeuArgArgAlaGlyIleA 184
I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I I M I I I I I I I I I I I II I 
58083 GCGCGAACAGGCCCTGGCCGCCGTCCCCCTGTTGCGCCGCGCCGGCATCG 58132 

184 spGluPheAsnValAspLeuIleAlaGlyPheProGlyGluAlaValGlu 200 
: ^VI;I I fill fl II INI \\ I Jill tl I I.I II II I I II M Ml M I I MINI 



201 SerPheGluGluThrLeuArgThrValLeuAlaLeuAspProProHisVa 217
^ - ^ I I l lN^^I I Ml 1 1 I I I I I I I Mil I I I I I I I I I 1 11 I I IH Hi ! I I I I v 



217 lSerValTyrProTyrArgAlaThrProLysThrValMetAlaMetGlnL 234 
_ - ^ '^" \\ || I I M I I I I II I M I I I I I I I I I I I I I I I I I i I I I I I I I M I I I I I I 

58233 CTCCGTCTACCCCTACCGCGCCACCCCCAAGACGGTCATGGCCATGCAGC 58282

234 euAspArgGluPheValGluAlaArgAsnArgAspGlyMetIleAspAla 250

I I I I I I I I I I I I I I II I I I I I I I I I I M I I I I I I I I II I I I I I I II I I I I 
58283 TCGACCGCGAGTTCGTCGAGGCCCGGAACCGGGACGGCATGATCGACGCC 58332 

251 TyrGluArgAlaMetAlaAlaLeuGlyAlaAlaGlyTyrHisGluTyrCy 267 

I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 
58333 TATGAACGGGCCATGGCCGCGCTCGGCGCCGCCGGCTATCACGAGTACTG 58382 

267 sHisGlyTyrTrpValArgAspAlaArgHisGluAspGlnAspGlyAsnT 284 

I I I I I II I M I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I 
58383 CCACGGCTACTGGGTGCGCGACGCGCGCCACGAGGACCAGGACGGCAACT 58432 

284 yrLysTyrAspLeuAlaGlyAspLysIleGlyPheGlySerGlyAlaGlu 300 
I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I II I I I I I I I I I I I I I I I 
58433 ACAAGTACGACCTGGCCGGCGACAAGATCGGCTTTGGCAGCGGCGCCGAA 58482

301 SerIleIleGlyHisHisLeuLeuTrpAsnGluAsnSerAlaTyrAlaAr 317

I I I I I I I I I I I I I I I I I I I I I I II I I II I I I II I I I I II I I I I I I I II M 
58483 TCGATCATCGGTCACCACCTGCTCTGGAACGAGAACAGCGCCTACGCCCG 58532

317 gTyrLeuLeuAlaProArgGluPheSerAlaAlaHisArgPheThrThrA 334 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
58533 CTACCTGCTCGCCCCCCGCGAGTTCTCCGCCGCCCACCGGTTCACCACCG 58582 

334 laGluProAspArgLeuThrAlaProValGlyGlyAlaLeuMetThrArg 350 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
58583 CCGAACCCGACCGCCTGACCGCCCCCGTCGGCGGCGCGCTGATGACCCGT 58632

351 GluGlyValValPheAlaArgPheArgArgLeuThrGlyLeuAspPheAl 367
I I I I II I II I I I I I I I I I I Ml I I I I I I Ml I I I II I I I I I I I Ml I I I I 

58633 GAAGGCGTGGTCTTCGCCCGCTTCCGCAGACTGACCGGCCTGGACTTCGC 58682 

367 aAspValArgAlaThrProTyrPheArgGlnTrpPheGluLeuLeuGluA 384 

I I I I II II I II II I I I I II II I I I II I II II I I I I II II I II II II I II I 
58683 GGACGTCCGCGCCACACCGTACTTCCGCCAGTGGTTCGAGCTCCTGGAGC 58732 

384 rgCysGlyGlyArgPheValGluThrProTyrSerLeuArgLeuGluPro 400 

I I M II II II I II I I II I II II I I M I I II I I I II II II I II M I II II I 
58733 GCTGCGGCGGCCGCTTCGTCGAGACGCCGTACAGCCTCCGCCTGGAGCCG 58782 

401 SerThrIleHisArgAlaTyrIleThrHisLeuAlaTyrThrMetAlaHi 417
I I M M II I I I II II II II II I I I I II I II I II II II I II I II II II I II 
58783 TCCACCATCCACCGCGCCTACATCACCCACCTCGCCTACACCATGGCCCA 58832 

417 sGlyLeuAlaProGluArgAla 424 
Ml II I I I I II! I II I I I I II I 
58833 TGGCCTGGCCCCC 58854



L1 ANSWER 1 OF 1 CAPLUS COPYRIGHT 2002 ACS
AN 2000:332950 CAPLUS 
AB Polyketides and nonribosomal peptides are assembled in a remarkably
similar fashion by polyketide synthases (PKSs) from short carboxylic acids
and nonribosomal peptide synthetases (NRPSs) from amino acids, resp.
Cloning and sequence anal. of the 90-kb bleomycin (BLM) biosynthesis
cluster from Streptomyces verticillus ATCC15003 revealed both NRPS and PKS
genes. By detg. the substrate specificity of individual NRPS and PKS
modules, a linear hybrid NRPS/PKS/NRPS model is formulated for the Blm
megasynthetase-templated assembly of BLM from nine amino acids and one
acetate. These results set the stage for engineering novel BLM analogs by
genetic manipulation of the blm biosynthesis genes, support the wisdom of
combining individual NRPS and PKS modules for combinatorial biosynthesis,
and lay the foundation to investigate the mol. basis for intermodular
communication between NRPS and PKS and the mechanism for bithiazole
biosynthesis.
AB The biosynthesis of bleomycin (Blm) in Sv. ATCC15003 has been studied as a
model to decipher the mechanism of how peptide synthase (PTS) and 
polyketide synthase (PKS) can be hybridized into a functional system to 
make metabolite from amino acids and short fatty acids. A 110kb gene
cluster for Blm biosynthesis was cloned from Sv. ATCC15003, 75kb of which 
has been fully sequenced and analyzed. Among the many novel discoveries 
made from this study are: (1) the first model for a hybrid PTS/PKS/PTS
biosynthetic system, (2) the first example of PKS gene from actinomycetes 
that contains a MT domain, and (3) a novel mechanism for bithiazole 
biosynthesis. These results should lay the foundation for rational 
engineering of hybrid PTS/PKS biosynthetic systems from other peptide and 
polyketide biosynthetic pathways to generate structural diversity. 